When calling from one (multi-section bin format) section to another, with differing vstart values (not one following the other), the call is not flagged by an error or warning.

$ nasm -v
NASM version 2.14
$ cat test.asm
        org 256
        section entry start=256
        call callme
        align 16

        section code vstart=0 align=16 follows=entry
        times 16 nop
callme:
$ nasm test.asm -o test.bin -l test.lst
$ cat test.lst
     1                                          org 256
     2                                          section entry start=256
     3 00000000 E8(1000)                        call callme
     4 00000003 90<rept>                        align 16
     5
     6                                          section code vstart=0 align=16 follows=entry
     7 00000000 90<rept>                        times 16 nop
     8                                  callme:
$ ndisasm test.bin
00000000  E80DFF            call 0xff10
00000003  90                nop
00000004  90                nop
00000005  90                nop
00000006  90                nop
00000007  90                nop
00000008  90                nop
00000009  90                nop
0000000A  90                nop
0000000B  90                nop
0000000C  90                nop
0000000D  90                nop
0000000E  90                nop
0000000F  90                nop
00000010  90                nop
00000011  90                nop
00000012  90                nop
00000013  90                nop
00000014  90                nop
00000015  90                nop
00000016  90                nop
00000017  90                nop
00000018  90                nop
00000019  90                nop
0000001A  90                nop
0000001B  90                nop
0000001C  90                nop
0000001D  90                nop
0000001E  90                nop
0000001F  90                nop
$
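For readers checking the encoding, the arithmetic behind the E8 0D FF bytes can be reproduced with a short sketch. It assumes 16-bit code (the bin format default), the call sitting at virtual address 100h as implied by "org 256", and callme at offset 10h of section code (vstart=0 plus 16 nop bytes). The function names are made up for illustration, and Python serves only as a calculator here.

```python
def rel16_displacement(target_vaddr, branch_vaddr, insn_len=3):
    """Displacement word of a 16-bit near call: target minus address of the next instruction."""
    return (target_vaddr - (branch_vaddr + insn_len)) & 0xFFFF


def rel16_target(branch_addr, disp, insn_len=3):
    """Address a 16-bit near call transfers to, given where the call itself is located."""
    return (branch_addr + insn_len + disp) & 0xFFFF


# Assumption: call at 100h (section entry), callme at 10h (section code, vstart=0).
disp = rel16_displacement(0x0010, 0x0100)
print(f"displacement word: {disp:04X}")                       # stored little-endian after E8
print(f"ndisasm view (origin 0): {rel16_target(0x0000, disp):04X}")
print(f"run-time view (origin 100h): {rel16_target(0x0100, disp):04X}")
```

Under these assumptions the run-time target is offset 10h in the segment that holds section entry, whereas the callme label is at file offset 20h, so the call does not reach it when the image is loaded flat.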
Why would this be an error?
I can't imagine a way it wouldn't be an error. Whenever this occurred in my work it was an error. Note that I specified differing vstart values. If the vstart/vfollows values of the sections are such that the section images may be laid out next to each other, and then indeed are so, that wouldn't be an error.
Hmm, it might not be obvious how to decide whether calls between two sections should be allowed. Therefore, I'd like to add a feature request for the bin output format: A section may be given a group= string. If two sections differ in their group, a default-on-and-error warning is generated.
* is generated when a near call or a near/short jump goes from one section to the other.
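The proposed rule could be sketched roughly as follows. Everything here is hypothetical: NASM has no group= attribute today, and the Section type, the check_branch function and the message wording are invented purely to illustrate the suggestion.

```python
from dataclasses import dataclass

# Branch kinds that carry a displacement; only these are covered by the proposal.
RELATIVE_BRANCHES = {"call near", "jmp near", "jmp short"}


@dataclass(frozen=True)
class Section:
    name: str
    group: str  # hypothetical group= string from the section directive


def check_branch(kind, src, dst):
    """Return a diagnostic for a relative branch that crosses groups, else None."""
    if kind in RELATIVE_BRANCHES and src.group != dst.group:
        return (f"{kind} from section {src.name} (group {src.group}) "
                f"to section {dst.name} (group {dst.group})")
    return None


entry = Section("entry", group="A")
code = Section("code", group="B")
print(check_branch("call near", entry, code))   # crosses groups: diagnostic text
print(check_branch("call near", entry, entry))  # same group: None
```

Sections that share a group string would keep today's behaviour, so only branches between differently grouped sections would be flagged.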
What it sounds like you want is really segmentation support in the bin format; it may not be obvious, but right now it really doesn't have any, and is intended to generate a "flat" binary (like ELF or COFF, but unlike OMF/OBJ). Adding "proper" segmentation support is nontrivial; there is a lot of discussion for how to do it with ELF right now, and although the bin format is all internal to the NASM binary, it is really not fundamentally different in the problem statement, mostly because way too much of the segmentation support that *is* there today is done in the OMF backend instead of in core code. Overall, it has been a huge ongoing project of mine to migrate code out of the backends into common code, but as you can well imagine, it is fairly painful because it touches *all* the backends, most of which are poorly done cut-and-pastes of one another.
Or is what you are doing something involving overlays?
What I'm doing is that I have sections defined like this:

https://bitbucket.org/ecm/ldebug/src/adab3da5025ff00b63381aec2d50bcee150989a8/source/debug.asm#lines-110

        cpu 8086
        org 100h
        addsection lDEBUG_DATA_ENTRY, align=16 start=100h
data_entry_start:
        addsection ASMTABLE1, align=16 follows=lDEBUG_DATA_ENTRY
        addsection ASMTABLE2, align=16 follows=ASMTABLE1
        addsection lDEBUG_CODE, align=16 follows=ASMTABLE2 vstart=0
code_start:
        addsection DATASTACK, align=16 follows=ASMTABLE2 nobits
        addsection INIT, align=16 follows=lDEBUG_CODE vstart=0

(addsection is a macro defined in lmacros3.mac)

Now, when I have eg the function d4message in the lDEBUG_CODE section, I may accidentally "d4 call d4message" in the lDEBUG_DATA_ENTRY section:

source$ hg d
diff --git a/source/debug.asm b/source/debug.asm
--- a/source/debug.asm
+++ b/source/debug.asm
@@ -805,6 +805,9 @@
        call entry_to_code_sel, installdpmi_code
 .fataldpmierr:
+d4     call d4message
+d4     asciz "Test",13,10
+
        mov ax, 4CFFh
        int 21h
source$ build_name=debugx build_options=-D_PM ./mak.sh -D_BOOTLDR -D_DEBUG1 -D_DEBUG4

NASM silently accepts this, generating eg E8 91 A9 at offset 5A91h, that is, "call 0425h", where 0425h is the offset of d4message in the lDEBUG_CODE section. However, this is not valid code, because lDEBUG_CODE is not loaded to the same segment as lDEBUG_DATA_ENTRY, so a near call from one to the other is invalid.

What I meant about groups is that NASM may have difficulties telling which sections are loaded where, and whether the call from one to another section is valid. By specifying different "groups" we could tell NASM that a near call from lDEBUG_DATA_ENTRY to lDEBUG_CODE is not valid.

This would be useful because we could have several sections (with appropriate vstart/vfollows) loaded to be addressed by the same segment, where near or short branches would be valid. These could be specified with the same "group", so that NASM knows to act as it always does as yet (ie, allow the near call).
To expand on the example: although they don't hold code here, the ASMTABLE1 and ASMTABLE2 sections use follows=lDEBUG_DATA_ENTRY and follows=ASMTABLE1 respectively, with no vfollows or vstart defined. So they're addressed by the same segment as lDEBUG_DATA_ENTRY, and loaded one after another in that way too. (It could be that they do vfollows/vstart similarly but are only actually placed in memory to be addressed by the same segment later.) Then lDEBUG_DATA_ENTRY should be in the same "group" as ASMTABLE1 and ASMTABLE2, so that near calls between these sections are accepted.
> Adding "proper" segmentation support is nontrivial;

I don't want anything but an error/warning for the near call that I deem invalid. The "groups" that I suggested are just a syntactic means of telling NASM which inter-section near/short branches are valid. Currently, NASM accepts all of them regardless of which sections are involved, which I don't want.
First of all, even if jumps are handled, there are all kinds of other memory references which could bite you, and there really isn't any way for NASM to know what should be permitted; for example, is taking the address of a symbol?

And now for the painful details why this is a headache implementation-wise... (read: there is just not enough of me...)

In the current code the backend doesn't know about jumps, and the front end doesn't actually know about sections. The latter is of course insane, but it obnoxiously requires doing stupidly major surgery on the expression evaluation code (this is stuff that predates my taking over the project in 1999, so 20+ year old code.)

The problem is that NASM passes around four kinds of values in a single integer, and sometimes it is even stuffed into a generic, nondedicated integer variable:

- Null section (NO_SEG)
- Absolute segment reference (in the range 0-0xffff, although the code actually uses a wider bitmask "just because")
- A section
- A section segment reference (SEG operator or similar)

Fixing that would enable pushing a *ton* of code up from the backends into the generic code, so that would be A Good Thing.

There is a new backend interface which allows the backend to know about what instructions are coming down the pipe and not just how they are encoded; this is actually required for some corner cases in both ELF and Mach-O, and really, really needs to be plumbed into the various backends, but that is some pretty heavy lifting, too. (Right now, *all* the backends except the debug backend receive data through a converter which converts to the legacy backend interface.) It is pretty much a prerequisite for solving the section problem, too...
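To make the "four kinds of values in a single integer" point concrete, here is a minimal sketch of what a dedicated type could look like. It is not how NASM represents these values; the SegKind and SegRef names and the idea of a section index are assumptions made up for the illustration, and only NO_SEG is a name taken from the comment above.

```python
from dataclasses import dataclass
from enum import Enum, auto


class SegKind(Enum):
    NONE = auto()         # null section (NO_SEG in the comment above)
    ABSOLUTE = auto()     # absolute segment reference, 0-0xffff
    SECTION = auto()      # a section
    SECTION_SEG = auto()  # segment reference of a section (SEG operator or similar)


@dataclass(frozen=True)
class SegRef:
    kind: SegKind
    value: int = 0  # absolute segment value, or a hypothetical section index

    def __post_init__(self):
        # Range check that a bare integer cannot enforce on its own.
        if self.kind is SegKind.ABSOLUTE and not 0 <= self.value <= 0xFFFF:
            raise ValueError("absolute segment out of range")


def same_section(a, b):
    """True only when both references name the same section, not its segment."""
    return (a.kind is SegKind.SECTION and b.kind is SegKind.SECTION
            and a.value == b.value)


print(same_section(SegRef(SegKind.SECTION, 1), SegRef(SegKind.SECTION, 1)))
print(same_section(SegRef(SegKind.SECTION, 1), SegRef(SegKind.SECTION_SEG, 1)))
```

With the kind carried explicitly, generic code could ask whether a branch stays inside one section without knowing any backend's integer encoding.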
> First of all, even if jumps are handled, there are all kinds of other memory
> references which could bite you, and there really isn't any way for NASM to
> know what should be permitted; for example, is taking the address of a symbol?

My suggestion would be purely about handling near and short branches, which are the ones with a displacement. Other references are potentially valid, and are generally of the absolute offset type. Whether RIP-relative addressing should be permitted is to be determined (but I'm an 86-DOS gal so I don't care).

As for the remainder of your comment, I thank you for the explanation! If you ever get around to working on that, let's throw my idea onto the wishlist pile.
From a bug during development of my debugger I found that recent NASM sometimes emits round parens into a listing file:

https://hg.pushbx.org/ecm/ldebug/rev/2b14b7f9ed90

This is emitted in listing.c when it receives an OUT_RELADDR (as opposed to square brackets for OUT_ADDRESS or OUT_SEGMENT):

https://github.com/netwide-assembler/nasm/blob/a916e4127b2eaa3bf40bddf3de9b0ceefc0d98a4/asm/listing.c#L259

Here's a small test case. It turns out that inter-section near calls and short jumps use the round parens while intra-section near calls and short jumps do not use any parens or brackets. In this example, I would usually consider the branches to quuux to be invalid due to the differing vstart while the branches to qux are more likely to be valid. However, in current lDebug there are no round parens emitted in the listing file at all. So to find invalid inter-section branches it would already help a lot if the assembler could warn about every reference which emits an OUT_RELADDR to the listing file.

$ cat test.asm

        section foo
        call bar
        call qux
        call quuux
        jmp short bar
        jmp short qux
        jmp short quuux

bar:
        section baz align=1
        times 8 db 0
qux:

        section xyzzy align=16 vstart=0
quuux:
$ nasm test.asm -l test.lst
$ cat test.lst
     1
     2                                          section foo
     3 00000000 E80C00                          call bar
     4 00000003 E8(0800)                        call qux
     5 00000006 E8(0000)                        call quuux
     6 00000009 EB04                            jmp short bar
     7 0000000B EB(08)                          jmp short qux
     8 0000000D EB(00)                          jmp short quuux
     9
    10                                  bar:
    11                                          section baz align=1
    12 00000000 00<rep 8h>                      times 8 db 0
    13                                  qux:
    14
    15                                          section xyzzy align=16 vstart=0
    16                                  quuux:
$ podhex test
000000  E8 0C 00 E8 11 00 E8 F7-FF EB 04 EB 0A EB F1 00  >................<
000010  00 00 00 00 00 00 00                             >.......<
000017
$
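The final bytes in the hex dump can be decoded by hand; the following sketch does the same mechanically. It assumes 16-bit code and handles only the two opcodes that occur in this test (E8 = call rel16, EB = jmp rel8); the branch_targets helper is made up for this illustration and is not a general disassembler.

```python
def branch_targets(image):
    """Decode leading E8/EB branches of a 16-bit image as (offset, mnemonic, target)."""
    decoded = []
    pos = 0
    while pos < len(image):
        opcode = image[pos]
        if opcode == 0xE8:      # call rel16
            name, length = "call", 3
        elif opcode == 0xEB:    # jmp rel8
            name, length = "jmp short", 2
        else:
            break               # first non-branch byte ends the decode
        disp = int.from_bytes(image[pos + 1:pos + length], "little", signed=True)
        decoded.append((pos, name, (pos + length + disp) & 0xFFFF))
        pos += length
    return decoded


# The 23 bytes shown by podhex above.
image = bytes.fromhex(
    "E8 0C 00 E8 11 00 E8 F7 FF EB 04 EB 0A EB F1 00"
    "00 00 00 00 00 00 00"
)
for offset, name, target in branch_targets(image):
    print(f"{offset:04X}  {name:<9}  -> {target:04X}")
```

The branches to bar resolve to 0Fh and those to qux to 17h, both consistent with a flat layout of foo followed by baz, while both branches to quuux resolve to 0000h, which in a flat image is the start of section foo rather than anything in xyzzy.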
This patch works for me:

nasmlist$ git diff
diff --git a/asm/listing.c b/asm/listing.c
index 186b8b4e..4e6d6176 100644
--- a/asm/listing.c
+++ b/asm/listing.c
@@ -257,6 +257,13 @@ static void list_output(const struct out_data *data)
         offset += size;
         break;
     case OUT_RELADDR:
+        /*!
+         *!out-reladdr [off] inter-section relative relocation
+         *!  warns that a relative relocation crosses a
+         *!  section boundary.
+         */
+        nasm_warn(WARN_OUT_RELADDR,
+                  "inter-section relative relocation");
         list_address(offset, "()", data->toffset, size);
         break;
     case OUT_RESERVE:
Having this as an optional default-off warning seems reasonable, although it is probably in the wrong place in the code. The usage case seems to be pretty unique to your case, but that doesn't mean it isn't a valid request.
Code checked in: 73676357de6bf4028a9ccfb0bd443ed57935870c This should be in nasm-2.16.02-rc2.
This is at https://github.com/netwide-assembler/nasm/commit/e64ae0a0c68edccf544c915a6cc65a562f1629b4

> This usually normal, but may not be handled by all possible target environments

I think you're missing an "is" in all these lines.
Oops, my old tab seems to have changed the resolution status.
The wording of the warning may be a little odd for reloc-abs-*. Note that dw foo is not "section-crossing" as I understand it (both the dw and the foo label are in section foo), but it does emit an OUT_ADDRESS hence this warning displays. So I would suggest rewording the abs warnings.

$ cat test.asm

        section foo
foo:
        dw bar
        dw foo
        section bar
bar:
$ nasm test.asm -w+reloc
test.asm:4: warning: 16-bit absolute section-crossing relocation [-w+reloc-abs-word]
test.asm:5: warning: 16-bit absolute section-crossing relocation [-w+reloc-abs-word]
$
By the way, since this has been added to the NASM 2.16.02 branch I have found another use for these new warnings: The eldstrict mode for my Extensions for lDebug (ELDs) requires marking all relocations with either the reloc or the reloc2 mmacro. It uses the reloc-abs warnings to flag all unmarked relocations as errors during build time. (The reloc-rel warnings are always enabled for all sources of the debugger.)

This is needed because I am (mis)using NASM's multi-section -f bin output format to create the ELDs, which are "position-independent" in that an ELD's data section and code section must each be loaded at a dynamically-chosen offset. Each ELD carries a linker which runs first thing when the ELD is loaded and fixes up all the relocations. So internal relocations need to be marked by "internaldatarelocation" or "internalcoderelocation" afterwards. External relocations to the debugger host program's variables are marked by "linkdatarelocation" macros instead, and the unlinked relocations refer to "relocateddata" which is in an empty nobits section. External relocations to the debugger's functions are done using several more macros, "extcall" and "extcallcall".

The eldstrict mode is mentioned on my blog: https://pushbx.org/ecm/dokuwiki/blog/pushbx/2023/1119_mid_november_work#eld_strict_mode_reloc_only_disables_reloc-abs_warnings

Here's the macro file implementing the eldstrict mode: https://hg.pushbx.org/ecm/ldebug/file/0b8b44f1088a/source/eld/eldcheck.mac
=== Quoth https://bugzilla.nasm.us/show_bug.cgi?id=3392678#c1

I just enabled the NASM 2.16.02 reloc-abs-byte warning as an error in my macro collection: https://hg.pushbx.org/ecm/lmacros/rev/54a8c35131aa

The lDebug, ldosboot boot.asm, ldosboot iniload.asm, instsect, and tsr sources all build fine with this change. I did test that eg "int cmd3" does trip the error as desired. So that works as a workaround to this issue.

The new warning was added as one of a class of warnings in response to my report at https://bugzilla.nasm.us/show_bug.cgi?id=3392571